Regular expressions for language engineering

Authors

  • L. Karttunen
  • Jean-Pierre Chanod
  • Gregory Grefenstette
  • A. Schiller
Abstract

Many of the processing steps in natural language engineering can be performed using finite-state transducers. An optimal way to create such transducers is to compile them from regular expressions. This paper is an introduction to the regular expression calculus, extended with certain operators that have proved very useful in natural language applications ranging from tokenization to light parsing. The examples in the paper illustrate in concrete detail some of these applications.

Similar resources

TreeRegex: An Extension to Regular Expressions for Matching and Manipulating Tree-Structured Text (Technical Report)

Tree-structured text is ubiquitous in software engineering and programming tasks. However, despite its prevalence, users frequently write custom, specialized routines to query and update such text. For example, a user might wish to rapidly prototype a compiler for a domain-specific language by issuing successive transformations, or they might wish to identify all the call sites of a particular ...

Derivatives for Enhanced Regular Expressions

Regular languages are closed under a wealth of formal language operators. Incorporating such operators in regular expressions leads to concise language specifications, but the transformation of such enhanced regular expressions to finite automata becomes more involved. We present an approach that enables the direct construction of finite automata from regular expressions enhanced with further o...

A Specification Language for Reo Connectors

Recent approaches to component-based software engineering employ coordinating connectors to compose components into software systems. Reo is a model of component coordination, wherein complex connectors are constructed by composing various types of primitive channels. Reo automata are a simple and intuitive formal model of context-dependent connectors, which provided a compositional semantics fo...

Exploring EFL Learners’ Use of Formulaic Sequences in Pragmatically Focused Role-play Tasks

Communicative language use largely entails regular patterns consisting of pre-constructed phrases or sequences. These sequences have been examined by many researchers to find the situation-based formulas which may help L2 learners follow a possibly more target-like speaking system. This study, therefore, explored two categories of formulaic expressions including speech formulas and situation-bo...

Generating Optimal Monitors for Extended Regular Expressions

Ordinary software engineers and programmers can easily understand regular patterns, as shown by the immense interest in and the success of scripting languages like Perl, based essentially on regular expression pattern matching. We believe that regular expressions provide an elegant and powerful specification language also for monitoring requirements, because an execution trace of a program is i...

Journal:
  • Natural Language Engineering

Volume 2

Publication date: 1996